[v7r3] Tokens auth#5045

Closed
TaykYoku wants to merge 170 commits into DIRACGrid:integration from TaykYoku:v7r2-pre33_tokens

Conversation

@TaykYoku TaykYoku commented Mar 18, 2021

Contributor

This PR is an updated version of #4650; all comments from the old PR have been taken into account here. The main update is the use of the /Core/Tornado components. The documentation is carried over from #4650 and will be updated.

This PR introduces the AuthN/AuthZ mechanism to DIRAC based on the use of Identity Provider
services using OAuth2/OIDC protocols.

A few remarks about the implementation.

The user AuthN/AuthZ paradigm changes. Previously, we collected all the necessary user information
in the CS Registry and then used it to identify and authorize client requests. Now we can use
external identity providers dynamically. Each time a user starts a DIRAC session, the
authentication can be delegated to an external identity provider. If authentication is successful,
the actual user profile information is cached in a dynamic session object which is created
for each active user. This session is used for further identification of user requests. Therefore,
the DIRAC CS Registry becomes just one of the possible sources of user profile information. Other
sources are (Federated) Identity Providers, e.g. Check-In, Indigo IAM, Google, etc., and VOMS.

X509 certificates are no longer the only way to identify users. Therefore,
certificate DNs are no longer the primary user identifiers. There can now also be user IDs
from various identity providers. Within DIRAC, the user name becomes the main, single and unique
user identifier, independent of the various user identities from different providers. As a consequence,
all the CS options with user identities as values, e.g. pilot user, should be expressed in terms
of user names rather than DNs. For backward compatibility, current options in the form of a DN are
still accepted but should be replaced eventually.
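
As a rough illustration of the backward-compatibility rule above, an option value could be normalized to a user name along the following lines. This is only a sketch: the `DN_TO_USERNAME` mapping and the `normalize_user_option` helper are hypothetical names, not part of DIRAC, and the sketch assumes a DN can be recognized by its leading `/`.

```python
# Hypothetical mapping for illustration; a real installation would
# take this information from the Registry.
DN_TO_USERNAME = {
    "/DC=org/DC=example/CN=Alice Example": "alice",
}


def normalize_user_option(value, dn_to_username=DN_TO_USERNAME):
    """Return a user name for an option given as a user name or as a DN."""
    value = value.strip()
    if value.startswith("/"):
        # Legacy form: the option holds a certificate DN.
        if value not in dn_to_username:
            raise ValueError("No user name registered for DN %r" % value)
        return dn_to_username[value]
    # Preferred form: the option already holds a user name.
    return value
```

With such a helper, both `pilotUser = alice` and a legacy DN value would resolve to the same user name.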

There can now be DIRAC users who do not have the usual personal certificates and will be identified
via the OAuth2/OIDC mechanism. However, to keep the DISET protocol working, which is still based
on X509 certificate proxies, the use of Proxy Provider services is enabled. These
services can create a certificate proxy on demand to be used by users in CLI interfaces and
by DIRAC services for performing operations on the users' behalf. The current solution is the DIRAC Proxy
Provider, which uses DIRAC CA certificates to generate user proxies (for DIRAC internal use only).
A solution using the RCauth Proxy Provider service has been tested and will be available in later PRs.
Users who have the usual X509 certificates will continue to use them in the usual way.

Identity Providers and Proxy Providers are added as new types of Resources.

Although this PR introduces new functionality, the current mechanisms of X509-based user
authentication are still maintained. Installations and users relying on good old X509 certificates
will continue to work without any changes.

More about the important changes:

  1. AuthServer:
    AuthServer (FrameworkSystem/private/authorization/AuthServer.py) has been implemented using the authlib library. It is embedded in the AuthHandler endpoint and essentially acts as an identity provider.
  2. AuthManagerService:
    This service has been created to accumulate and update user profile information and tokens from identity providers in AuthDB and the AuthManagerData cache.
  3. DISET:
    Has been modified to work with user identification from identity providers that must be registered in the CFG, just like DNs.
    • AuthManager(authorizeBySession method)
    • ThreadConfig
    • RequestHandler
    • BaseClient
    • TornadoBaseClient
  4. TornadoServer:
    Has been slightly modified so that it can be used for WebApp and REST APIs.
    • HandlerManager can collect endpoints and the portal
    • added the base class TornadoREST for the REST handlers
    • created a BaseRequestHandler class containing the main code for TornadoREST and TornadoService, to which authentication with OAuth tokens has been added
  5. REST APIs:
    This project requires a REST interface, so some REST APIs were transferred from the RESTDIRAC project and integrated with the TornadoServer framework ([v7r2] HTTPs services within DIRAC #4677):
    • src/DIRAC/FrameworkSystem/API/AuthHandler.py
    • src/DIRAC/FrameworkSystem/API/ProxyHandler.py
    • src/DIRAC/ConfigurationSystem/API/ConfigurationHandler.py
    • as an example:
from DIRAC.Core.Tornado.Server.TornadoREST import TornadoREST

class EndpointNameHandler(TornadoREST):
  AUTH_PROPS = "all"
  LOCATION = "/DIRAC"
  METHOD_PREFIX = 'web_'

  path_endpoint = ['([a-z]+)']
  def web_endpoint(self, key):
     """ REST endpoint
           **GET** /endpoint/<key>?<options>
     """
  6. Registry:
    Since this project deals with dynamic data (user information can be updated), a cache is used to hold it temporarily:

    • ProxyManagerService has been slightly modified to collect VOMS user data.
    • created separate classes for caching the received information:
      • ProxyManagerData -- VOMS information data
      • AuthManagerData -- identity providers information data
    • Registry changed to read the cache with second priority, after the DIRAC CFG
  7. Resources:

    • added new resource IdProvider to describe identity providers
    • added new providers:
      • OAuth2IdProvider -- provides OAuth2 authentication
      • OAuth2ProxyProvider -- provides proxy generation with OAuth2 authentication (e.g. RCauth)
  8. Third-party packages are required:

    • authlib==0.15.3
    • termcolor
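
The Registry item above says that cached data is read with second priority after the DIRAC CFG. A minimal sketch of that lookup order is shown below; the function name and the shape of the two dictionaries are assumptions made for illustration and do not reflect the actual Registry code.

```python
def get_user_ids(username, cfg_users, cache_users):
    """Return identity-provider IDs for a user: static CFG first, then the cache.

    Both arguments are assumed to look like {"<username>": {"ID": [...]}}.
    """
    ids = cfg_users.get(username, {}).get("ID")
    if ids:
        # The static configuration wins whenever it has an answer.
        return list(ids)
    # Otherwise fall back to the dynamically collected data.
    return list(cache_users.get(username, {}).get("ID", []))
```

The point of this ordering is that a value written explicitly in the configuration is never overridden by dynamically collected data.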

BEGINRELEASENOTES

NEW: Multiple changes to provide OAuth2/OIDC user AuthN/AuthZ and OAuth session management.
CHANGE: multiple changes to use username instead of DN where applicable.
Changes affect the following Systems: Transformation, Configuration, DataManagement, Framework, RequestManagement, WorkloadManagement. Also tests, Core, Interfaces, Resources.

*ConfigurationSystem
NEW: Registry - use cached data from the AuthManager and ProxyManager clients, modify methods in new logic context
FIX: Resources - split difficult method
FIX: Utilities - fix method name and path
NEW: Add REST API

*Core
NEW: Move the WebApp "core part" to the DIRAC, try to use existing tornado framework
NEW: Add DB version
NEW: DictCache - add getDict method
CHANGE: AuthManager - split authorization logic, fix test
CHANGE: RequestHandler - fill credDict by AuthManager methods
CHANGE: align with the PR changes, use ID as IdP user ID in DISET transport flow

*FrameworkSystem
NEW: Add AuthManager service, DB, client and client with caching IdP information data, REST API
NEW: ProxyManager - split client part to parts with VOMS information cache data and simple client, add REST API
CHANGE: ProxyManager - modify to use user/group in requests
NEW: dirac-proxy-init - able to use authentication flow through the Identity providers
NEW: halo - new class to use spinners for waiting process, for ex. waiting authentication

*RequestManagementSystem
CHANGE: Request - add owner parameter, use AuthManager methods
CHANGE: align with the PR changes

*WorkloadManagementSystem
NEW: Add User/pilotUser parameter
CHANGE: align with the PR changes

*TransformationSystem
CHANGE: align with the PR changes

*DataManagementSystem
CHANGE: align with the PR changes

*Resources
NEW: add OAuth identity and proxy providers
CHANGE: align with the PR changes

*Interfaces
CHANGE: align with the PR changes

*tests
CHANGE: align with the PR changes

ENDRELEASENOTES

Comment thread src/DIRAC/Interfaces/API/DiracAdmin.py
Comment thread src/DIRAC/Interfaces/API/DiracAdmin.py Outdated
Comment thread src/DIRAC/Interfaces/API/DiracAdmin.py Outdated
Comment thread tests/Integration/Framework/Test_ProxyDB.py Outdated
chaen commented Mar 30, 2021

Contributor

Following our discussion last Friday, what I would like to see is a graphic representation of the various components and how they interact with each other in the different scenarios, for example the whole exchange that happens when someone runs dirac-proxy-init. It would also help if the graphic clearly showed which services/components are the new ones you introduced. This would really help in seeing how a service like CheckIn or IAM would integrate and which DIRAC components it would replace.

TaykYoku commented Apr 1, 2021

Contributor Author

Here is a simple diagram showing the basics. I am still working on other, more detailed schemas.
[Image: CLIAuthFlow]
[Image: WebAuthFlow]

@TaykYoku TaykYoku force-pushed the v7r2-pre33_tokens branch from 5f37085 to a8d6687 Compare April 4, 2021 16:24

@fstagni fstagni left a comment

Contributor

This is a partial review. I have not yet properly studied many of the important changes made here.

Thare are present two important imports, that provide caching data::

* :mod:`ProxyManagerData <FrameworkSystem.Client.ProxyManagerData>` caches information from VOMS
* :mod:`AuthManagerData <FrameworkSystem.Client.AuthManagerData>` caches information from IdPs

Contributor

This module, as well as all the other modules in ConfigurationSystem/Client/Helpers/ should IMO be used to only interact with the information found in CS. What you are building here is a "UserIdentity client/discovering tool". I would prefer this is a specific, different module.

Contributor Author

Yes, I agree that the Registry is part of ConfigurationSystem, and if I want the Registry to use some external source I need to sync that source with the CS sections/options. But such synchronization brings additional difficulties, such as the possibility of an error when updating simultaneously with another commit, the need to track the authorship of each record, etc.
Another option is to create, as you say, a separate module and use the Registry there. In this case, we need to review all the code where the Registry is used and most likely replace it almost everywhere with the newly created module.

Could I move the Registry to DIRAC/Core/?

Comment thread src/DIRAC/Resources/IdProvider/IdProviderFactory.py Outdated
Comment thread src/DIRAC/Core/Base/DB.py
authorizeBySession(credDict, logObj=self.log)
else:
authorizeByCertificate(credDict, logObj=self.log)
return credDict

Contributor

While there are only a few modifications to this file, I think we agreed to not develop dip (so, DISET) any further.

Contributor Author

Unfortunately, these changes were made long before this decision. I will probably review them again to see if they can be avoided.

Comment thread src/DIRAC/Core/DISET/private/BaseClient.py
Comment thread src/DIRAC/Core/Utilities/Proxy.py Outdated
Comment thread src/DIRAC/Core/Utilities/Proxy.py
Comment thread src/DIRAC/FrameworkSystem/Service/ProxyManagerHandler.py Outdated
Comment thread src/DIRAC/FrameworkSystem/Utilities/halo.py
csSection = PathFinder.getServiceSection('Framework/ProxyManager')
if requestedUsername != credDict['username'] or requestedUserGroup != credDict['group']:
return S_ERROR("You can't get %s@%s proxy!" % (credDict['username'], credDict['group']))
elif not gConfig.getValue('%s/downloadablePersonalProxy' % csSection, False):

Contributor

This is not the right way to get service-specific options. But anyway, what's the rationale for this option?

Contributor Author

downloadablePersonalProxy is a flag that allows or denies downloading a personal proxy from the Proxy RESTful endpoint (DIRAC/FrameworkSystem/API/ProxyHandler.py) using authorization with tokens.

I agree that this check is out of place here:
I moved this flag in the configuration from Services/ProxyManager to APIs/Proxy, since the decision is made before a request is sent to the ProxyManager service, and moved everything related to it from the ProxyManager service to the Proxy RESTful endpoint. I also added the isDownloadablePersonalProxy method to DIRAC/ConfigurationSystem/Client/Utilities.py to read this flag.
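
To make the intended placement of the check concrete, here is a minimal sketch of how the endpoint could decide before any request is sent to the service. The `can_download_proxy` helper, the flat `config` dictionary and the exact option path are assumptions for illustration only; the real code reads the flag through the configuration helpers.

```python
def can_download_proxy(cred, requested_user, requested_group, config):
    """Decide at the REST endpoint whether a personal proxy may be downloaded.

    Returns a (allowed, message) tuple.
    """
    if (requested_user, requested_group) != (cred["username"], cred["group"]):
        return False, "Only your own proxy can be requested"
    # Assumed option path; the download stays disabled unless switched on.
    if not config.get("APIs/Proxy/downloadablePersonalProxy", False):
        return False, "Downloading personal proxies is disabled"
    return True, ""
```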

@TaykYoku TaykYoku force-pushed the v7r2-pre33_tokens branch from 0270fc3 to 8ecaf81 Compare April 15, 2021 18:22
@TaykYoku TaykYoku force-pushed the v7r2-pre33_tokens branch from 4091d4c to 37a750a Compare May 4, 2021 11:20


def getDIRACClient():
""" Get registred in the configuration DIRAC authentication client

Contributor

registred -> registered (several instances of that)



def getAuthorisationServerMetadata():
""" Get authoraisation server metadata

Contributor

authoraisation -> authorization (and in general, sadly, let's use "Authorization", not "Authorisation", e.g. in the name of this method...)

Comment on lines +677 to +688
data['jwks_uri'] = data.get('jwks_uri', data['issuer'] + '/jwk')
data['token_endpoint'] = data.get('token_endpoint', data['issuer'] + '/token')
data['userinfo_endpoint'] = data.get('userinfo_endpoint', data['issuer'] + '/userinfo')
data['registration_endpoint'] = data.get('registration_endpoint', data['issuer'] + '/register')
data['authorization_endpoint'] = data.get('authorization_endpoint', data['issuer'] + '/authorization')
data['grant_types_supported'] = data.get('grant_types_supported', [
'code', 'authorization_code', 'urn:ietf:params:oauth:grant-type:device_code', 'refresh_token'
])
data['response_types_supported'] = data.get('response_types_supported', [
'code', 'device', 'id_token token', 'id_token', 'token'
])
data['code_challenge_methods_supported'] = data.get('code_challenge_methods_supported', ['S256'])

Contributor

For my info... are these "standards"? Where do they come from?

@TaykYoku TaykYoku May 20, 2021

Contributor Author

Yes, they come from the OAuth 2.0 Authorization Server Metadata specification (RFC 8414).
The endpoints are defined by the FrameworkSystem.API.AuthHandler methods.
It will support two types of authorization flow: Device Flow and Authorization Code.
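
For reference, the endpoint defaulting in the lines under review can be expressed more compactly with `dict.setdefault`. The following is only a sketch of the same idea, not the code in this PR: the function name is hypothetical, it returns a copy instead of modifying `data` in place, and it covers only the endpoint keys and the PKCE method list.

```python
# Endpoint suffixes copied from the lines under review.
DEFAULT_ENDPOINTS = (
    ("jwks_uri", "/jwk"),
    ("token_endpoint", "/token"),
    ("userinfo_endpoint", "/userinfo"),
    ("registration_endpoint", "/register"),
    ("authorization_endpoint", "/authorization"),
)


def fill_metadata_defaults(data):
    """Return a copy of the metadata with missing endpoints derived from the issuer."""
    filled = dict(data)
    for key, suffix in DEFAULT_ENDPOINTS:
        # Explicitly configured values are kept; only missing keys are filled.
        filled.setdefault(key, filled["issuer"] + suffix)
    filled.setdefault("code_challenge_methods_supported", ["S256"])
    return filled
```

For an issuer such as `https://dirac.example.org/auth` (a made-up URL), a missing `token_endpoint` would become `https://dirac.example.org/auth/token`.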

self.__group = tp[1]
if tp[2]:
self.__setup = tp[2]
self.__ID = tp[3] or self.__ID

Contributor

Move this after
self.__setup = tp[2] or self.__setup please

Comment on lines +1 to +5
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
# $HeadURL$
__RCSID__ = "$Id$"

Contributor

...just an empty file is fine.

Comment on lines +87 to +93
def __getRPC(self):
""" Get RPC
"""
if not self.rpc:
from DIRAC.Core.Base.Client import Client
self.rpc = Client()._getRPC(url="Framework/AuthManager", timeout=10)
return self.rpc

Contributor

Why not call the client directly?

Contributor Author

To avoid the additional imports that come with AuthManagerClient. The problem is that AuthManagerData is used by the Registry, which in turn can be imported anywhere. Therefore, to avoid possible import loops, AuthManagerData imports only the minimum required.
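
The general pattern being described, deferring an import until the object is first needed, can be sketched with the standard library alone. The `LazyClient` class below is a hypothetical illustration and not the code in this PR; it only shows why a module written this way pulls in nothing extra at import time.

```python
import importlib


class LazyClient(object):
    """Create the underlying client on first use instead of at import time."""

    def __init__(self, module_name, factory_name):
        self._module_name = module_name
        self._factory_name = factory_name
        self._client = None

    def get(self):
        if self._client is None:
            # The import is deferred to the first call, so importing the
            # module that defines this object does not import the client.
            module = importlib.import_module(self._module_name)
            self._client = getattr(module, self._factory_name)()
        return self._client
```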

Comment on lines +103 to +110
servCrash = self.__service.get('crash')
if servCrash and servCrash[1] > 2:
return servCrash[0]
result = self.__getRPC().getIdProfiles(userID)
# If the AuthManager service is down client will ignore it 1 minute
if result.get('Errno', 0) == 1112:
crash = self.__service.get('crash')
self.__service.add('crash', 60, value=(result, (crash[1] + 1) if crash else 1))

Contributor

Can you explain what's this about?

Contributor Author

Here AuthManagerData waits for the AuthManager service to "wake up".
If the service fails to respond twice, then for the next minute the client returns the error without contacting the service, so that it does not have to wait for a timeout every time.
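
The behaviour described here resembles a small circuit breaker. A self-contained sketch of the idea follows; it is not the code under review. The `FailureGuard` class and its parameters are hypothetical, it treats any result whose `OK` field is false as a failure (the reviewed code checks one specific error number), and it uses a plain timestamp where the reviewed code uses an expiring cache entry.

```python
import time


class FailureGuard(object):
    """Skip calls to a failing service for a while after repeated errors."""

    def __init__(self, max_failures=2, quiet_period=60, clock=time.time):
        self.max_failures = max_failures
        self.quiet_period = quiet_period
        self.clock = clock
        self.failures = 0
        self.failed_at = 0.0
        self.last_error = None

    def call(self, func, *args):
        now = self.clock()
        if self.failures and now - self.failed_at >= self.quiet_period:
            # The quiet period is over: forget the old failures.
            self.failures = 0
        if self.failures >= self.max_failures:
            # Too many recent failures: answer from memory, no remote call.
            return self.last_error
        result = func(*args)
        if result.get("OK"):
            self.failures = 0
        else:
            self.failures += 1
            self.failed_at = now
            self.last_error = result
        return result
```

The `clock` argument exists only so that the waiting period can be exercised in a test without sleeping.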

self.updateProfiles(uid, data)
return result

def getIDsForDN(self, dn, provider=None):

Contributor

Can we have more than one ID per DN?

Contributor Author

I think this is unlikely, but still possible

proxyToConnect=proxyToConnect,
token=token)
if not getVOMSAttributeForGroup(group):
gLogger.verbose("No voms attribute assigned to group %s when requested proxy" % group)

Contributor

Suggested change
- gLogger.verbose("No voms attribute assigned to group %s when requested proxy" % group)
+ gLogger.verbose("No VOMS attribute assigned to group", group)

""" ProxyManagerClient has the function to "talk" to the ProxyManager service. Also, when requesting information
about users, this information is cached in a separate class
:mod:`ProxyManagerData <FrameworkSystem.Client.ProxyManagerData>`, and is used, in the Registry for example,
to reduce the number of requests to the server part

Contributor

The signatures of the methods in this client have changed considerably. Given that this is a client, maaaybe it's OK, but not so much.
Other methods were removed; for those, do you think it would be better to mark them as deprecated?

atsareg pushed a commit that referenced this pull request Sep 5, 2021
[v7r3] login with tokens (first part of the #5045)
@chrisburr

Member

@TaykYoku Can this PR be closed now that #5149 has been merged? I hope we can have smaller and easier to discuss PRs for the future changes now the foundations have been added.

TaykYoku commented Sep 7, 2021

Contributor Author

@chrisburr That's right, that's what #5149 was created for.

@TaykYoku TaykYoku closed this Sep 7, 2021
4 participants